feat(SmtForest) Skeleton out the new forest #767

iamrecursion · 2026-01-09T09:29:11Z

Describe your changes

This commit primarily exists to sketch out the interface for the new forest implementation, aimed at getting agreement on the interface and basic architecture before implementation begins in earnest. It does this directly on the LargeSmtForest type, which will be the user-facing portion of the API and likely later renamed to simply SmtForest.

As part of doing this, it includes a number of important utility types, including:

Backend, which defines the minimal set of features required for the SMT forest's pluggable backends. Backends are aware of tree semantics.
Operation, contained in SmtUpdateBatch and SmtForestUpdateBatch which define sets of modifications to be made to individual trees in the forest and the entire forest respectively.
InMemoryPrefix, which implements the in-memory portion of a tree for the backends that keep only some prefix in memory. This was originally implemented for a different approach, but the code is still valid and will be useful in the future, and so is still included.
A small set of utility types: SubtreeLevels, RootInfo, and LinearIndex. These were also implemented for a different approach but will be useful in the future.
An assortment of error types to ensure strongly-typed error handling throughout the SMT forest API.

Contains work toward the completion of #670. I have set it as "no changelog" as it is not yet intended to be part of the public API.

Checklist before requesting a review

~~Repo forked~~ and branch created from next according to naming convention.
Commit messages and codestyle follow conventions.
Relevant issues are linked in the PR description.
Tests added for new functionality.
Documentation/comments updated according to changes.

This commit primarily exists to sketch out the interface for the new forest implementation, aimed at getting agreement on the interface and basic architecture before implementation begins in earnest. It does this directly on the `LargeSmtForest` type, which will be the user-facing portion of the API and likely later renamed to simply `SmtForest`. As part of doing this, it includes a number of important utility types, including: - `Backend`, which defines the minimal set of features required for the SMT forest's pluggable backends. Backends are aware of tree semantics. - `Operation`, contained in `SmtUpdateBatch` and `SmtForestUpdateBatch` which define sets of modifications to be made to individual trees in the forest and the entire forest respectively. - `InMemoryPrefix`, which implements the in-memory portion of a tree for the backends that keep only some prefix in memory. This was originally implemented for a different approach, but the code is still valid and will be useful in the future, and so is still included. - A small set of utility types: `SubtreeLevels`, `RootInfo`, and `LinearIndex`. These were also implemented for a different approach but will be useful in the future. - An assortment of error types to ensure strongly-typed error handling throughout the SMT forest API.

iamrecursion · 2026-01-09T11:56:42Z

I wasn't sure if it made sense to include the extra bits of implementation that will be useful in the future. If you don't want them as part of this PR I can pull them out until they are needed; my main worry was just them bitrotting and ending up with a difficult merge/rebase when they do get used. Let me know!

iamrecursion · 2026-01-09T12:35:36Z

@drahnr, does the API for LargeSmtForest satisfy what you were asking for in #670?

huitseeker

This is a nice simplification compared to the Storage trait from PR #700.

The decision to make backends "aware of tree semantics" (as the module doc says) but not responsible for the actual tree logic seems like a good middle ground.

miden-crypto/src/merkle/smt/large_forest/mod.rs

huitseeker · 2026-01-09T15:01:02Z

miden-crypto/src/merkle/smt/large_forest/mod.rs

+    ///   the provided `backend`.
+    pub fn new(_backend: B) -> Result<Self, B::Error> {
+        todo!("LargeSmtForest::new")
+    }


Looking at issue #670, the design sketch shows an add_tree method for adding new independent trees to the forest:

fn add_tree(&self, leaves: LeafMap) -> Result<Root>;

I do not see an equivalent in this skeleton. How would a caller add a brand new tree (not derived from modifying an existing one) to the forest? Is the idea that you can call modify_tree on the empty tree root to create one, or is add_tree planned for a follow-up?

The plan is currently that you just modify tree on an empty root. Is add_tree needed or useful?

huitseeker · 2026-01-09T15:01:05Z

miden-crypto/src/merkle/smt/large_forest/backend.rs

+    ///   `operations`.
+    /// - If applying the provided `operations` results in no changes to a given lineage of trees in
+    ///   the forest, then no new tree must be allocated in that lineage.
+    fn modify_forest(


This new Backend trait is simpler and does not have explicit begin/commit methods. But I am curious how error recovery would work for persistent backends. If modify_forest partially succeeds (say, updates 3 of 5 trees) and then hits an error, what is the expected state? Should the trait doc clarify whether implementations must be atomic (all-or-nothing) or whether partial failures are allowed?

(this is the thrust of my comments on abort /rollback in #700, BTW — though I'm more interested in the atomicity contract in the abstract sense than how that's conveyed in the API, here)

My thinking on this is that atomicity is now an internal concept to the Backend. It was never intended to be user-facing in the previous design, just exposed to the forest itself, as I don't think we ever want to contend with user-facing rollback.

The main reason the concept exists is to prevent data corruption in the persistent backend. If something like your above scenario occurs in the persistent backend the whole transaction will be backed out.

That said, I think documenting something is probably valuable. What would you expect to happen for the in-memory backend should updating the 4th tree fail? To roll back to the previous state, or allow the partial delta? The same question applies if the backend is part-way through applying operations and fails.

I'm not really an API client here, so understanding how this should behave is important to me!

bobbinth

Looks good! Thank you! I left some comments inline - but we are pretty close.

I did not review InMemoryPrefix or subtree-related code. I'd probably remove them from this PR because this is more about aligning on the LargeSmtForest interface, and it would speed up review/merge process.

miden-crypto/src/merkle/smt/large_forest/backend.rs

bobbinth · 2026-01-09T21:44:39Z

miden-crypto/src/merkle/smt/large_forest/mod.rs

+        // We start by clearing any history for which the `version` corresponds to the latest
+        // version and hence the full tree.
+        self.backend.versions()?.for_each(|(root, v)| {
+            if v == version {
+                self.histories
+                    .get_mut(&root)
+                    .expect(
+                        "A full tree did not have a corresponding history, but is required
+        to",
+                    )
+                    .clear();
+            }
+        });


It is not clear to me why we need this. Wouldn't code on lines 354 - 357 below do the same thing?

Not currently. The history will refuse to remove the last delta as it can't know if the truncation point it is given is between the latest delta and the current version, or whether it is newer than the current version.

bobbinth · 2026-01-09T21:45:45Z

miden-crypto/src/merkle/smt/large_forest/mod.rs

+        // Then we just run through all the histories and truncate them to this version if needed,
+        // which provides the correct behaviour.
+        self.histories.values_mut().for_each(|h| {
+            h.truncate(version);
+        });


If the history becomes empty post truncation, should we remove it from self.histories map. Or is the idea to keep histories for all roots managed by the forest - even if some histories are empty?

If we do keep entries for all roots, it may be good to have a separate map which keeps track of trees with non-empty histories. This way, we won't need to iterate over all histories here (which could be expensive because there could be many millions of trees in the forest).

I had intended that we keep histories for all lineages in the forest as it simplifies history operation by not having to deallocate/reallocate histories. I had a comment stating this, but it clearly got lost in the shuffle! I've added it back, and also added a set tracking which histories actually contain deltas.

miden-crypto/src/merkle/smt/large_forest/mod.rs

miden-crypto/src/merkle/smt/large_forest/operation.rs

miden-crypto/src/merkle/smt/large_forest/root.rs

miden-crypto/src/merkle/smt/large_forest/mod.rs

miden-crypto/src/merkle/smt/large_forest/backend.rs

miden-crypto/src/merkle/smt/large_forest/history/mod.rs

miden-crypto/src/merkle/smt/large_forest/mod.rs

drahnr · 2026-01-12T08:54:13Z

@drahnr, does the API for LargeSmtForest satisfy what you were asking for in #670?

Looks good, left 👍 at the specific APIs that 'd be needed.

This commit predominantly cleans up a few elements of the forest design, but also removes some code that was originally included for later usage.

iamrecursion · 2026-01-12T11:37:24Z

I've pushed a commit that addresses the majority of comments left above. There are some open discussions in my mind with regards to particular behaviours, so if we could clarify those that would be great!

miden-crypto/src/merkle/smt/large_forest/mod.rs

bobbinth

Looks good! Thank you! I left a coupe more small comments inline.

But, I just realized that I missed one major thing: I don't think we can actually use tree roots to identify trees. This was possible in the old implementation of the SmtForest where subtrees between trees were shared. But in this implementation, it is perfectly possible for two (or more) trees to exist with the same root but actually be distinct logical trees.

The main example of this is that we have many accounts and it may so happen that two accounts end up having the same exact storage trees. This could happen at the same version or at different versions. For example, account A could have tree with root R1 at block 10, and account B could have a tree with root R1 at block 20. Or both of them could have trees with roots R2 at block 30.

The fact that there are multiple trees doesn't really matter for querying, but it does matter for updates. For example, if both accounts have the R2 tree at block 30 and account B gets updated, it's tree may become R3 at block 31 - but this should affect account As tree.

I haven't thought too much about it yet, but I think a solution could be to just pass something like context or domain together with a tree root. So, instead of saying "I want to do something with a tree with root R1" we'd be saying "I want to do something with a tree with root R1 in the context A".

I believe this should address the use cases of miden-node as we always know in the context of which account we are querying the SmtForest (but @drahnr, please confirm).

So, there are two ways to proceed:

We can merge this PR and then make a follow-up PR to fix the shortcomings.
We can fix things in this PR.

I have a slight preference for the first approach.

miden-crypto/src/merkle/smt/mod.rs

miden-crypto/src/merkle/smt/large_forest/backend.rs

miden-crypto/src/merkle/smt/large_forest/mod.rs

miden-crypto/src/merkle/smt/large_forest/backend.rs

iamrecursion requested review from bobbinth and huitseeker January 9, 2026 09:29

iamrecursion self-assigned this Jan 9, 2026

iamrecursion added the merkle Related to Merkle trees or MMRs label Jan 9, 2026

iamrecursion force-pushed the forest-skeleton branch 2 times, most recently from 8316321 to a67dff8 Compare January 9, 2026 11:23

iamrecursion added the no changelog This PR does not require an entry in the `CHANGELOG.md` file label Jan 9, 2026

iamrecursion force-pushed the forest-skeleton branch from a67dff8 to 312cb33 Compare January 9, 2026 11:30

iamrecursion marked this pull request as ready for review January 9, 2026 11:55

huitseeker reviewed Jan 9, 2026

View reviewed changes

bobbinth reviewed Jan 9, 2026

View reviewed changes

drahnr reviewed Jan 12, 2026

View reviewed changes

miden-crypto/src/merkle/smt/large_forest/mod.rs Show resolved Hide resolved

drahnr reviewed Jan 12, 2026

View reviewed changes

miden-crypto/src/merkle/smt/large_forest/backend.rs Outdated Show resolved Hide resolved

drahnr reviewed Jan 12, 2026

View reviewed changes

miden-crypto/src/merkle/smt/large_forest/backend.rs Outdated Show resolved Hide resolved

drahnr reviewed Jan 12, 2026

View reviewed changes

miden-crypto/src/merkle/smt/large_forest/history/mod.rs Show resolved Hide resolved

drahnr reviewed Jan 12, 2026

View reviewed changes

miden-crypto/src/merkle/smt/large_forest/mod.rs Outdated Show resolved Hide resolved

Updates in response to PR feedback

59a9c3e

This commit predominantly cleans up a few elements of the forest design, but also removes some code that was originally included for later usage.

iamrecursion force-pushed the forest-skeleton branch from a700d29 to 59a9c3e Compare January 12, 2026 11:30

iamrecursion requested review from bobbinth, drahnr and huitseeker January 12, 2026 11:31

Simplify the error handling for the backend

3fcdb5b

huitseeker reviewed Jan 12, 2026

View reviewed changes

miden-crypto/src/merkle/smt/large_forest/mod.rs Outdated Show resolved Hide resolved

miden-crypto/src/merkle/smt/large_forest/mod.rs Outdated Show resolved Hide resolved

bobbinth approved these changes Jan 13, 2026

View reviewed changes

miden-crypto/src/merkle/smt/mod.rs Outdated Show resolved Hide resolved

miden-crypto/src/merkle/smt/large_forest/backend.rs Outdated Show resolved Hide resolved

miden-crypto/src/merkle/smt/large_forest/mod.rs Outdated Show resolved Hide resolved

drahnr reviewed Jan 13, 2026

View reviewed changes

miden-crypto/src/merkle/smt/large_forest/backend.rs Show resolved Hide resolved

Final PR feedback

9402cf8

iamrecursion merged commit 4be02a0 into next Jan 13, 2026
27 checks passed

iamrecursion deleted the forest-skeleton branch January 13, 2026 10:28

feat(SmtForest) Skeleton out the new forest #767

feat(SmtForest) Skeleton out the new forest #767

Uh oh!

Conversation

iamrecursion commented Jan 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Describe your changes

Checklist before requesting a review

Uh oh!

iamrecursion commented Jan 9, 2026

Uh oh!

iamrecursion commented Jan 9, 2026

Uh oh!

huitseeker left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bobbinth left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

drahnr commented Jan 12, 2026

Uh oh!

iamrecursion commented Jan 12, 2026

Uh oh!

Uh oh!

Uh oh!

bobbinth left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

iamrecursion commented Jan 9, 2026 •

edited

Loading